Faster Parsing by Supertagger Adaptation
Abstract
We propose a novel self-training method for a parser which uses a lexicalised grammar and supertagger, focusing on increasing the speed of the parser rather than its accuracy. The idea is to train the supertagger on large amounts of parser output, so that the supertagger can learn to supply the supertags that the parser will eventually choose as part of the highest-scoring derivation. Since the supertagger supplies fewer supertags overall, the parsing speed is increased. We demonstrate the effectiveness of the method using a CCG supertagger and parser, obtaining significant speed increases on newspaper text with no loss in accuracy. We also show that the method can be used to adapt the CCG parser to new domains, obtaining accuracy and speed improvements for Wikipedia and biomedical text.
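The adaptation idea in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the paper's actual model: `ToySupertagger` is a hypothetical per-word frequency tagger, the `beta` cut-off is one possible way to decide how many supertags to supply, and the "parser output" is a hand-written list standing in for derivations chosen by a real parser. The point is only to show how retraining on the parser's own choices can shrink the set of supertags supplied per word.

```python
from collections import Counter, defaultdict


class ToySupertagger:
    """Hypothetical per-word frequency model standing in for a real supertagger."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, tagged_sentences):
        # Each sentence is a list of (word, supertag) pairs.
        for sentence in tagged_sentences:
            for word, tag in sentence:
                self.counts[word][tag] += 1

    def tag(self, word, beta=0.1):
        # Supply every supertag whose relative frequency is at least
        # beta times that of the most frequent supertag for this word.
        tags = self.counts.get(word)
        if not tags:
            return []
        total = sum(tags.values())
        best = max(tags.values()) / total
        return sorted(t for t, c in tags.items() if c / total >= beta * best)


def average_ambiguity(tagger, words, beta=0.1):
    """Mean number of supertags supplied per word."""
    return sum(len(tagger.tag(w, beta)) for w in words) / len(words)


# Small hand-annotated seed data (made up for this sketch).
gold_data = [
    [("time", "N"), ("flies", "S\\NP")],
    [("time", "N/N"), ("flies", "N")],
]

# Stand-in for a large amount of parser output: the supertags that the
# parser ended up choosing in its highest-scoring derivations.
parser_output = [[("time", "N"), ("flies", "S\\NP")]] * 20

baseline = ToySupertagger()
baseline.train(gold_data)

adapted = ToySupertagger()
adapted.train(gold_data)
adapted.train(parser_output)  # self-training step on parser output

words = ["time", "flies"]
print(average_ambiguity(baseline, words))
print(average_ambiguity(adapted, words))
```

In this toy setting the adapted tagger supplies fewer supertags per word than the baseline, which is the property the abstract credits for the speed increase; a real parser would then have fewer derivations to consider.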
Similar resources
Forest-guided Supertagger Training
Supertagging is an important technique for deep syntactic analysis. A supertagger is usually trained independently of the parser using a sequence labeling method. This presents an inconsistent training objective between the supertagger and the parser. In this paper, we propose a forest-guided supertagger training method to alleviate this problem by incorporating global grammar constraints into ...
The Importance of Supertagging for Wide-Coverage CCG Parsing
This paper describes the role of supertagging in a wide-coverage CCG parser which uses a log-linear model to select an analysis. The supertagger reduces the derivation space over which model estimation is performed, reducing the space required for discriminative training. It also dramatically increases the speed of the parser. We show that large increases in speed can be obtained by tightly int...
Faster parsing and supertagging model estimation
Parsers are often the bottleneck for data acquisition, processing text too slowly to be widely applied. One way to improve the efficiency of parsers is to construct more confident statistical models. More training data would enable the use of more sophisticated features and also provide more evidence for current features, but gold standard annotated data is limited and expensive to produce. We ...
Combining Supertagging and Lexicalized Tree-Adjoining Grammar Parsing
In this paper we study various reasons and mechanisms for combining Supertagging with Lexicalized Tree-Adjoining Grammar (LTAG) parsing. Because of the highly lexicalized nature of the LTAG formalism, we experimentally show that notions other than sentence length play a factor in observed parse times. In particular, syntactic lexical ambiguity and sentence complexity (both are terms we define i...
Hidden Markov model-based supertagging in a user-initiative dialogue system
In this paper we outline the advantages of deploying a shallow parser based on Supertagging in an automatic dialogue system in a call center that basically leaves the initiative with the user as far as (s)he wants (in the literature called user-initiative or adaptive, in contrast to system-initiative dialogue systems). The Supertagger relies on a Hidden Markov model and is trained with German i...